karen: support-first prompt + pre-purchase scenarios #317
Closed
…e scenarios tournament synthesis: A's prompt structure (support-first, explicit intent signals, strong anti-interrogation) + B's skill extraction (sales-closer loaded on demand via flexus_fetch_skill) + fallback if skill unavailable. 3 new benchmark scenarios: plan comparison, just browsing, browsing-to-intent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
backslash-quote inside single-quoted yaml string broke parser. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rication fixes 2 failing scenarios: - just_browsing: karen used product_catalog instead of flexus_vector_search, asked follow-ups in pure support mode - browsing_to_intent: fabricated links not grounded in KB Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…io, clean judge_instructions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Humberto feedback: don't tell the model "NEVER do X" when X was never instructed. Removed "NEVER interrogate", "Don't upsell", "NEVER push for a decision / manufacture urgency". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
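The parser fix in the commits above comes down to how YAML treats single-quoted strings: a backslash is a literal character there, and the only escape is a doubled single quote. A minimal sketch of that rule follows. The helper names are hypothetical and are not taken from this PR, and the sample sentence is illustrative rather than the string that actually broke.

```python
def yaml_single_quote(text: str) -> str:
    """Wrap text as a YAML single-quoted scalar.

    Inside single quotes YAML keeps backslashes as-is; a literal
    single quote is written by doubling it ('').
    """
    return "'" + text.replace("'", "''") + "'"


def yaml_single_unquote(scalar: str) -> str:
    """Inverse of yaml_single_quote for a one-line scalar."""
    if len(scalar) < 2 or scalar[0] != "'" or scalar[-1] != "'":
        raise ValueError("not a single-quoted scalar")
    return scalar[1:-1].replace("''", "'")


# Illustrative sentence only (an assumption, not the original YAML value).
line = "don't tell the model what it can't do"
quoted = yaml_single_quote(line)
print(quoted)  # 'don''t tell the model what it can''t do'
```

Writing the apostrophe as \' instead would end the scalar at that quote, which is consistent with the failure the commit message describes.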
humbertoyusta
approved these changes
Apr 17, 2026
- Ground every recommendation in flexus_vector_search() results. No invented features.
- If they say "I'll think about it" or "let me check with my team" — that's a valid outcome. Offer to help later, resolve.

## Sales-Assist (Only on Buying Intent)
Contributor
If we're splitting, and this is mostly a support bot, do we really need to separate a default support mode from sales-assist?
The only sales-only part of the prompt is:
When you detect buying intent: listen 70% talk 30%, clarify their problem, paint the outcome not features, handle objections honestly, offer a human when stuck.
I don't think that justifies the "if this, then support; if that, then sell" branching. Just telling it how to answer, in a single mode that is mostly support, should be fine.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor
Author
Superseded by #322 (sales extraction from Karen) which removed the Sales-Assist section, BANT, and C.L.O.S.E.R. entirely. The pre-purchase scenarios from this PR should be re-added separately if needed.
Summary
Fibery #2403 — Karen Pre-purchase answers and recommendations quality.
- very_limited expert prompt: support-first, sales-assist only on buying intent
- skills/sales-closer/SKILL.md (loaded on demand via flexus_fetch_skill)

Tournament Result
Produced via /tournament (N=2). Candidate A (Opus, conservative reorder) scored 7.65, Candidate B (Sonnet, skill extraction) scored 7.05. Judge recommended SYNTHESIZE: A's prompt structure + scenarios, B's skill extraction.
skills/sales-closer/SKILL.md

Files
- karen_prompts.py
- skills/sales-closer/SKILL.md
- very_limited__pre_purchase_plan_comparison.yaml
- very_limited__pre_purchase_just_browsing.yaml
- very_limited__pre_purchase_browsing_to_intent.yaml

Benchmark Results (staging, v1.2.231)
Scores by model (actual_rating / 10)
- plan_comparison
- saas_cs_platform_short (regression)
- just_browsing
- browsing_to_intent

Key findings
- … flexus_vector_search correctly, no fabrication, no unnecessary follow-ups)
- … browsing_to_intent (8) but fails just_browsing (5) on tool selection

Model recommendation
Karen's very_limited expert should use gpt-5.4 for customer-facing conversations. It follows support-first instructions and tool constraints most reliably. grok-4-1-fast-reasoning remains fine for the default (admin) expert where sales compliance is less critical.

Test plan
🤖 Generated with Claude Code
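
The browsing_to_intent failure mentioned in the commits (links not grounded in the KB) is the kind of issue that could be caught mechanically before a judge ever sees the reply. The sketch below is only an illustration under assumptions: the function names are hypothetical, kb_snippets stands in for whatever text flexus_vector_search returns (its real return shape is not shown in this PR), and this is not the benchmark's actual judging logic.

```python
import re

# Rough URL matcher; stops at whitespace and common closing punctuation.
URL_PATTERN = re.compile(r"https?://[^\s)>\]\"']+")


def extract_urls(text: str) -> set[str]:
    """Collect URLs from text, trimming trailing sentence punctuation."""
    return {match.rstrip(".,;:!?") for match in URL_PATTERN.findall(text)}


def ungrounded_urls(reply: str, kb_snippets: list[str]) -> list[str]:
    """Return URLs in the reply that appear in none of the KB snippets."""
    kb_urls: set[str] = set()
    for snippet in kb_snippets:
        kb_urls |= extract_urls(snippet)
    return sorted(extract_urls(reply) - kb_urls)


# Hypothetical data: example.com URLs are placeholders, not real KB content.
kb_snippets = ["Plans are described at https://example.com/pricing."]
reply = (
    "See https://example.com/pricing and "
    "https://example.com/enterprise-trial for details."
)
print(ungrounded_urls(reply, kb_snippets))
# ['https://example.com/enterprise-trial']
```

A check like this only covers links. Invented features or prices would still need the judge instructions in the scenario files.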